r/MachineLearning - [P] Me and a friend made an online insult classifier as a university assignment. We could use your help estimating its real life usage accuracy.

Basically, we had to clean up about 4,000 labelled examples (insult / not insult), vectorize them (just bag of words at the moment), and find a good classifier. Random forest and gradient boosting worked best; the web version currently runs the random forest, which scored a ROC AUC of 0.962. The site is set up so you can grade each prediction, and every grade gets logged. Since the data set is rather small, I don't think the usual metrics give a realistic picture of real-world accuracy. If we get it tested enough, we should be able to estimate that accuracy from the grades we receive, and also use those examples to extend the data set.
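
One way the "estimate it from the received grades" step could look is sketched below. This is an assumption-laden illustration, not the project's actual code: it treats each logged grade as a simple correct/incorrect flag and puts a Wilson score interval around the observed accuracy, so the uncertainty from a small number of grades is visible. The `grades` list and the `wilson_interval` helper are hypothetical names, and the counts are made up for the example.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = correct / total
    denom = 1 + z * z / total
    centre = (p_hat + z * z / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / total + z * z / (4 * total * total))
    ) / denom
    return centre - half_width, centre + half_width


# Hypothetical grade log: True means the user marked the prediction as correct.
# These numbers are invented for illustration only.
grades = [True] * 172 + [False] * 28

correct = sum(grades)
total = len(grades)
low, high = wilson_interval(correct, total)

print(f"observed accuracy: {correct / total:.3f}")
print(f"approx. 95% interval: [{low:.3f}, {high:.3f}]")
```

Keep in mind that people who volunteer to test the site are not a random sample of real-world input, so an interval like this only describes accuracy on the kind of text testers choose to type in.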